Towards Spanish Verbs' Selectional Preferences Automatic Acquisition: Semantic Annotation of the SenSem Corpus
Abstract
We present the results of an agreement task carried out within the framework of the KNOW Project, which consisted of manually annotating an agreement sample of 50 sentences extracted from the SenSem corpus. Disambiguation was carried out for all nouns, proper nouns and adjectives in the sample, all of which were assigned EuroWordNet (EWN) synsets. The task showed that Spanish WN exhibits 1) a lack of explanatory clarity (it does not define word meanings, but glosses and exemplifies them instead; nor does it systematically encode metaphoric meanings); 2) structural inadequacy (some words appear as hyponyms of another sense of the same word; sometimes a general sense and a specific one related to the same concept coexist in Spanish WN with no structural link between them; some hyperonymy relationships are likely to raise doubts for human annotators; cases of auto-hyponymy can even be found); 3) cross-linguistic inconsistency (English EWN contains concepts whose lexical equivalent is missing in Spanish WN; glosses in one language more often than not contradict or diverge from glosses in another language).
Related papers
Semantic Hand-Tagging of the SenSem Corpus Using Spanish WordNet Senses
This paper presents the semantic annotation of the SenSem Spanish corpus, research focused on the semantic annotation of the nominal heads of the verbal arguments, with the final goal of acquiring semantic preferences for verb senses. We used Spanish WordNet 1.6 senses in the annotation process. This process involves the analysis of the adequacy of WordNet for semantic annotation and, in case...
Enriching a lexical semantic net with selectional preferences by means of statistical corpus analysis
Broad-coverage ontologies which represent lexical semantic knowledge are being built for more and more natural languages. Such resources provide very useful information for word sense disambiguation, which is crucial for a variety of NLP tasks (e.g. semantic annotation of corpora, information retrieval, or semantic inferencing). Since the manual encoding of such ontologies is very labour-intens...
Anotación semántica de los sustantivos del corpus SenSem
The main goal of this project is the semantic annotation of the argument nouns of the SenSem corpus with WordNet synsets. The final objective of the research is the acquisition of semantic preferences.
The Sensem Corpus: a Corpus Annotated at the Syntactic and Semantic Level
The primary aim of the project SENSEM (Sentence Semantics, BFF2003-06456) is the construction of a Lexical Data Base illustrating the syntactic and semantic behavior of each of the senses of the 250 most frequent verbs of Spanish. With this objective in mind, we are currently building an annotated corpus consisting of sentences extracted from the electronic version of the newspaper El Periódico...
Automatic Selectional Preference Acquisition for Latin Verbs
We present a system that automatically induces Selectional Preferences (SPs) for Latin verbs from two treebanks by using Latin WordNet. Our method overcomes some of the problems connected with data sparseness and the small size of the input corpora. We also suggest a way to evaluate the acquired SPs on unseen events extracted from other Latin corpora.